Using a Multimedia Parallel Corpus to Investigate English-Galician Subtitling
Abstract
This paper presents an ongoing research project that involves the compilation and exploitation of Veiga, a multimedia corpus of subtitled films, as a means of investigating the practice of English intralingual subtitling and English-Galician interlingual subtitling. Our project draws on recent work in corpus-based translation studies and its applications in the field of audiovisual translation and, more specifically, on the use and development of multimedia corpora to empirically investigate film discourse, translation and subtitling. The Veiga corpus has been developed under the broader framework of the CLUVI Parallel Corpus, although it goes beyond the text-only approach that characterizes the various CLUVI corpora by enabling users to access the corpus content, that is, the English and Galician subtitles of twenty-four English-language audiovisual products, in their natural, multi-semiotic form. The paper discusses issues of corpus design, multimedia text processing (alignment and annotation) and data retrieval, alongside questions about the potential uses of our multimedia parallel corpus in subtitling and translation practice, research and education.
Similar resources
A Multimedia Parallel Corpus of English-Galician Film Subtitling
In this paper, we present an ongoing research project focused on the building, processing and exploitation of a multimedia parallel corpus of English-Galician film subtitling, showing the TMX-based XML specification designed to encode both audiovisual features and translation alignments in the corpus, and the solutions adopted for making the data available over the web in multimedia format. 1998...
Strategies Used in the Translation of Interlingual Subtitling
This study was an attempt to identify the interlingual strategies employed to translate English subtitles into Persian and to determine their frequency, as well. Contrary to many countries, subtitling is a new field in Iran. The study, a corpus-based, comparative, descriptive, non-judgmental analysis of an English-Persian parallel corpus, comprised English audio scripts of five movies of differ...
Spoken to Spoken vs. Spoken to Written: Corpus Approach to Exploring Interpreting and Subtitling
issue of Polibits includes a selection of papers related to the topic of processing of semantic information. Processing of semantic information involves usage of methods and technologies that help machines to understand the meaning of information. These methods automatically perform analysis, extraction, generation, interpretation, and annotation of information contained on the Web, corpus, nat...
Parallel corpus-based bilingual terminology extraction
This paper presents a parallel corpora-based bilingual terminology extraction method based on the occurrence of bilingual morphosyntactic patterns in probabilistic translation dictionaries. We discuss an experiment focused on two language pairs – English-Galician and English-Portuguese, and show results which experimentally confirm the high degree of accuracy of the proposed extraction technique.
Bootstrapping a Portuguese WordNet from Galician, Spanish and English Wordnets
In this article we exploit the possibility of bootstrapping a European Portuguese WordNet from the English, Spanish and Galician wordnets using Probabilistic Translation Dictionaries automatically created from parallel corpora. The process generated a total of 56 770 synsets and 97 058 variants. An evaluation of the results using the Brazilian OpenWordNet-PT as a gold standard resulted in a pr...